Conversation

@xuzhao9
Contributor

@xuzhao9 xuzhao9 commented Dec 15, 2025

On AMD, we observed high variance when benchmarking in dind (Docker-in-Docker): meta-pytorch/tritonbench#726

On the NVIDIA B200 runner, we observed high variance when benchmarking with multiple CPU cores: #130

We are pinning the job to a single CPU thread to stabilize the benchmark results.
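
For illustration only, a minimal Python sketch of pinning a benchmark process to a single CPU on Linux. The PR text does not show how the job is pinned, so this is an assumption rather than the mechanism used here: the helper name `pin_to_single_cpu` is hypothetical, and setting `OMP_NUM_THREADS` is an extra assumed step for OpenMP-backed libraries.

```python
import os


def pin_to_single_cpu(cpu=None):
    """Restrict the current process to one CPU and return that CPU's index.

    Hypothetical helper for illustration; not taken from the PR.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise RuntimeError("os.sched_setaffinity is not available on this platform")

    # CPUs the process is currently allowed to run on.
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[0] if cpu is None else cpu
    if target not in allowed:
        raise ValueError(f"CPU {target} is not in the allowed set {allowed}")

    # Pin this process (pid 0 means "the calling process") to one CPU.
    os.sched_setaffinity(0, {target})

    # Assumption: also cap OpenMP worker threads. This only takes effect
    # if it is set before the libraries that read it are initialized.
    os.environ["OMP_NUM_THREADS"] = "1"
    return target


if __name__ == "__main__":
    pinned = pin_to_single_cpu()
    print(f"pinned to CPU {pinned}; affinity is now {os.sched_getaffinity(0)}")
```

Child processes started afterwards inherit the affinity mask, so calling this before launching the benchmark keeps the whole run on one core.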

Test plan:
https://github.com/pytorch/pytorch-integration-testing/actions/runs/20983664412

@xuzhao9 xuzhao9 changed the title from "[wip][tritonbench] Test benchmarking without docker" to "[tritonbench] Benchmarking without docker with single CPU thread" Jan 14, 2026
@xuzhao9 xuzhao9 mentioned this pull request Jan 14, 2026
@xuzhao9 xuzhao9 requested a review from huydhn January 14, 2026 16:12
Contributor

@huydhn huydhn left a comment

LGTM!

@huydhn
Contributor

huydhn commented Jan 14, 2026

Just FYI, with the docker-in-docker setup for multi-tenancy, here is the environment the CI uses when the job does not run in a Docker image: https://github.com/meta-pytorch/pytorch-gha-infra/blob/main/multi-tenant/images/multi-tenant-gpu/Dockerfile

@xuzhao9
Contributor Author

xuzhao9 commented Jan 14, 2026

Yeah, we ran into a few issues with dind on AMD, and it seems --cpuset-cpus doesn't work in dind. We decided to move to non-dind for now.

@xuzhao9 xuzhao9 merged commit 3e0093a into main Jan 14, 2026
1 check passed
@xuzhao9 xuzhao9 deleted the xz9/disable-dind branch January 15, 2026 17:36
3 participants